In computer science, instruction selection is the stage of a compiler backend that transforms its tree-based middle-level intermediate representation (IR) into a low-level IR very close to its final target language. In a typical compiler, it precedes both instruction scheduling and register allocation, so its output IR has an infinite set of pseudoregisters and may still be subject to peephole optimization; otherwise, it closely resembles the target machine code, bytecode, or assembly language. It works by "covering" the intermediate representation with as few tiles as possible. A tile is a template that matches a portion of the IR tree and can be implemented with a single target instruction.
A basic approach in instruction selection is to use some templates for translation of each instruction in an intermediate representation. But naïve use of templates leads to inefficient code in general. Additional attention needs to be paid to avoid duplicated memory access by reordering and merging instructions and promote the usage of registers.
For example, see the following sequence of intermediate instructions:
t1 = a t2 = b t3 = t1 + t2 a = t3 b = t1
A good tiling for the x86 architecture is a succinct set of instructions:
XCHG EAX, b ADD a, EAX
Typically, instruction selection is implemented with a backwards dynamic programming algorithm which computes the "optimal" tiling for each point starting from the end of the program and based from there. Instruction selection can also be implemented with a greedy algorithm that chooses a local optimum at each step.
The code that performs instruction selection is usually automatically generated from a list of valid patterns. Various generator programs differ in the amount of analysis that they perform while they run, rather during the compiler's instruction selection phase. An example of generator program for instruction selection is BURG.
The lowest common denominator strategy is an instruction selection technique used on platforms where Processor Supplementary Instructions exist to make executable programs portable across a wide range of computers. Under a lowest common denominator strategy, the default behaviour of the compiler is to build for the lowest common architecture. Use of any available processor extension is switched off by default, unless explicitly switched on by command line switches.
The use of a lowest common denominator strategy means that Processor Supplementary Instructions and Processor Supplementary Capabilities features are not used by default.